NSF PAR Search | NSF Public Access Repository

Simulating urban energy use under climate change scenarios and retrofit plans in coastal Texas

https://doi.org/10.1007/s44212-024-00046-8

Zhu, Chunwu; Ye, Xinyue; Du, Jiaxin; Hu, Zhiheng; Shen, Yang; Retchless, David (June 2024, Urban Informatics)

Abstract Rapid urbanization, climate change, and aging infrastructure pose significant challenges to achieving sustainability and resilience goals in urban building energy use. Although retrofitting offers a viable solution to mitigate building energy use, there has been limited analysis of its effects under various weather conditions associated with climate change in urban building energy use simulations. Moreover, certain parameters in energy simulations necessitate extensive auditing or survey work, which is often impractical. This research proposes a framework that integrates various datasets, including building footprints, Lidar data, property appraisals, and street view images, to conduct neighborhood-scale building energy use analysis using the Urban Modeling Interface (UMI), an Urban Building Energy Model (UBEM), in a coastal neighborhood in Galveston, Texas. Seven retrofit plans and three weather conditions are considered in the scenarios of building energy use. The results show that decreasing the U-value of building envelopes helps reduce energy use, while increasing the U-value leads to higher energy consumption in the Galveston neighborhood. This finding provides direction for coastal Texas cities, like Galveston, to update building standards and implement retrofit measures.
Multi-Modal Contrastive Learning for Proteins by Combining Domain-Informed Views

Xu, Haotian; You, Yuning; Shen, Yang (March 2024, Machine Learning for Genomics Explorations workshop at ICLR 2024)

Proteins, often represented as multi-modal data of 1D sequences and 2D/3D structures, provide a motivating example for the communities of machine learning and computational biology to advance multi-modal representation learning. Protein language models over sequences and geometric deep learning over structures learn excellent single-modality representations for downstream tasks. It is thus desirable to fuse the single-modality models for better representation learning, but how to fuse them effectively, at a modest computational cost yet with a significant downstream performance gain, remains an open question. To answer the question, we propose to make use of separately pretrained single-modality models, integrate them in parallel connections, and continuously pretrain them end-to-end under the framework of multi-modal contrastive learning. The technical challenge is to construct views for both intra- and inter-modality contrasts while addressing the heterogeneity of various modalities, particularly various levels of semantic robustness. We address the challenge by using domain knowledge of protein homology to inform the design of positive views, specifically protein classifications of families (based on similarities in sequences) and superfamilies (based on similarities in structures). We also assess the use of such views compared to, together with, and composed with other positive views such as identity and cropping. Extensive experiments on enzyme classification and protein function prediction benchmarks demonstrate the potential of domain-informed view construction and combination in multi-modal contrastive learning.
Reducing Catastrophic Forgetting With Associative Learning: A Lesson From Fruit Flies

https://doi.org/10.1162/neco_a_01615

Shen, Yang; Dasgupta, Sanjoy; Navlakha, Saket (October 2023, Neural Computation)

Abstract Catastrophic forgetting remains an outstanding challenge in continual learning. Recently, methods inspired by the brain, such as continual representation learning and memory replay, have been used to combat catastrophic forgetting. Associative learning (retaining associations between inputs and outputs, even after good representations are learned) plays an important function in the brain; however, its role in continual learning has not been carefully studied. Here, we identified a two-layer neural circuit in the fruit fly olfactory system that performs continual associative learning between odors and their associated valences. In the first layer, inputs (odors) are encoded using sparse, high-dimensional representations, which reduces memory interference by activating nonoverlapping populations of neurons for different odors. In the second layer, only the synapses between odor-activated neurons and the odor’s associated output neuron are modified during learning; the rest of the weights are frozen to prevent unrelated memories from being overwritten. We prove theoretically that these two perceptron-like layers help reduce catastrophic forgetting compared to the original perceptron algorithm, under continual learning. We then show empirically on benchmark data sets that this simple and lightweight architecture outperforms other popular neural-inspired algorithms when also using a two-layer feedforward architecture. Overall, fruit flies evolved an efficient continual associative learning algorithm, and circuit mechanisms from neuroscience can be translated to improve machine computation.
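The two-layer idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random sparse projection, the top-k winner-take-all rule, the additive weight update, and all function names (`make_projection`, `encode`, `learn`, `predict`) are assumptions chosen for illustration. The only properties taken from the abstract are that inputs are encoded sparsely in a high-dimensional layer and that learning changes only the weights between active neurons and the associated output neuron.

```python
import random


def make_projection(n_in, n_hidden, fan_in, seed=0):
    """Hypothetical layer 1 wiring: each hidden neuron samples a few inputs."""
    rng = random.Random(seed)
    return [rng.sample(range(n_in), fan_in) for _ in range(n_hidden)]


def encode(x, projection, k):
    """Sparse code: indices of the k most strongly driven hidden neurons."""
    acts = [sum(x[j] for j in idx) for idx in projection]
    order = sorted(range(len(acts)), key=lambda h: acts[h], reverse=True)
    return set(order[:k])


def learn(weights, active, label, lr=1.0):
    """Update only weights from active neurons to the labeled output.

    All other weights stay frozen, so earlier associations stored on
    non-overlapping neurons are not overwritten.
    """
    for h in active:
        weights[label][h] += lr


def predict(weights, active):
    """Pick the output neuron with the largest summed weight over active neurons."""
    scores = [sum(w[h] for h in active) for w in weights]
    return max(range(len(scores)), key=lambda c: scores[c])


if __name__ == "__main__":
    # Hand-built projection so the example is easy to trace by eye.
    projection = [[0, 1], [2, 3], [0, 2], [1, 3]]
    odor_a, odor_b = [1, 1, 0, 0], [0, 0, 1, 1]
    weights = [[0.0] * len(projection) for _ in range(2)]
    learn(weights, encode(odor_a, projection, k=1), label=0)
    learn(weights, encode(odor_b, projection, k=1), label=1)
    # The association for odor_a is still intact after learning odor_b.
    print(predict(weights, encode(odor_a, projection, k=1)))
```

Because the two odors activate different hidden neurons, learning the second association does not touch the weights that store the first one.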
Assessing the predicted impact of single amino acid substitutions in MAPK proteins for CAGI6 challenges

https://doi.org/10.1007/s00439-024-02724-8

Turina, Paola; Petrosino, Maria; Enriquez_Sandoval, Carlos A; Novak, Leonore; Pasquo, Alessandra; Alexov, Emil; Alladin, Muttaqi Ahmad; Ascher, David B; Babbi, Giulia; Bakolitsa, Constantina; et al (March 2025, Human Genetics)

Free, publicly-accessible full text available March 1, 2026
Hydrovolcanic Explosions at the Lava Ocean Entry of the 2018 Kilauea Eruption Recorded by Ocean-Bottom Seismometers

https://doi.org/10.1785/0220220195

Banerjee, Puja; Shen, Yang (March 2023, Seismological Research Letters)

Abstract From the beginning of May 2018, the Kilauea Volcano on the island of Hawaii experienced its largest eruption in 200 yr followed by a period of unrest for months. Because hot molten lava entered the ocean from the ocean-entry point near the lower East Rift Zone, the lava–water interaction led to explosions. Some explosions were near the water surface and ejected fragments of lava, also known as lava bombs. In the early morning on 16 July 2018, one of those lava bombs, which was almost the size of a basketball, hit a sightseeing boat and injured 23 people. In this study, we analyzed the hydrophone data recorded from July to mid-September by ocean-bottom seismometers (OBSs) deployed offshore near the ocean entry point to identify and locate the hydroacoustic signals of the lava–water explosions. Acoustic signals of hydrovolcanic explosions are characterized by a short duration (less than a few seconds) and a broad frequency range (at least up to 100 Hz). To automate event detection, a short-term average versus long-term average method was applied to the complete dataset. Approximately 4300 events were detected and located near the coastline and further used to prepare a catalog. The distribution of the lava–water explosions is consistent with the pattern of the offshore lava delta formed during the 2018 eruption. Identifying such hydroacoustic signals recorded by OBSs may provide new avenues of research using various seismoacoustic events associated with volcanic eruptions.
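The short-term average versus long-term average (STA/LTA) detection mentioned in this abstract can be sketched as follows. This is a generic illustration rather than the study's processing code: the window lengths, the on/off thresholds, the use of squared amplitude as the characteristic function, and the function names `sta_lta` and `detect_events` are all assumptions.

```python
def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, per sample.

    Both windows end at the current sample. The ratio is 0.0 until the
    long window is full. Assumes n_sta <= n_lta.
    """
    # Prefix sums of squared amplitude give each window average in O(1).
    prefix = [0.0]
    for x in signal:
        prefix.append(prefix[-1] + x * x)
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def detect_events(ratios, on_threshold, off_threshold):
    """Return (start, end) sample indices of triggered events.

    An event starts when the ratio reaches on_threshold and ends at the
    first later sample where it drops below off_threshold.
    """
    events, start = [], None
    for i, r in enumerate(ratios):
        if start is None and r >= on_threshold:
            start = i
        elif start is not None and r < off_threshold:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(ratios) - 1))
    return events


if __name__ == "__main__":
    # Synthetic trace: low-level noise with one short, strong burst.
    trace = [0.1 if i % 2 == 0 else -0.1 for i in range(200)]
    for i in range(120, 125):
        trace[i] = 5.0
    print(detect_events(sta_lta(trace, n_sta=5, n_lta=50), 3.0, 1.5))
```

A short burst raises the short-term average much faster than the long-term one, so the ratio spikes for impulsive signals such as the explosions described above and stays near 1 for steady background noise.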
A shallow slow slip event in 2018 in the Semidi segment of the Alaska subduction zone detected by machine learning

https://doi.org/10.1016/j.epsl.2023.118154

He, Bing; Wei, XiaoZhuo; Wei, Meng; Shen, Yang; Alvarez, Marco; Schwartz, Susan Y. (June 2023, Earth and Planetary Science Letters)

Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative

Wei, Tianxin; You, Yuning; Chen, Tianlong; Shen, Yang; He, Jingrui; Wang, Zhangyang (May 2023, NeurIPS)

Somatic estrogen receptor α mutations that induce dimerization promote receptor activity and breast cancer proliferation

https://doi.org/10.1172/JCI163242

Irani, Seema; Tan, Wuwei; Li, Qing; Toy, Weiyi; Jones, Catherine; Gadiya, Mayur; Marra, Antonio; Katzenellenbogen, John A; Carlson, Kathryn E; Katzenellenbogen, Benita S; et al (January 2024, Journal of Clinical Investigation)

Cross-modality and self-supervised protein embedding for compound–protein affinity and contact prediction

https://doi.org/10.1093/bioinformatics/btac470

You, Yuning; Shen, Yang (September 2022, Bioinformatics)

Abstract
Motivation: Computational methods for compound–protein affinity and contact (CPAC) prediction aim at facilitating rational drug discovery by simultaneous prediction of the strength and the pattern of compound–protein interactions. Although the desired outputs are highly structure-dependent, the lack of protein structures often makes structure-free methods rely on protein sequence inputs alone. The scarcity of compound–protein pairs with affinity and contact labels further limits the accuracy and the generalizability of CPAC models.
Results: To overcome the aforementioned challenges of structure naivety and labeled-data scarcity, we introduce cross-modality and self-supervised learning, respectively, for structure-aware and task-relevant protein embedding. Specifically, protein data are available in both modalities of 1D amino-acid sequences and predicted 2D contact maps that are separately embedded with recurrent and graph neural networks, respectively, as well as jointly embedded with two cross-modality schemes. Furthermore, both protein modalities are pre-trained under various self-supervised learning strategies, by leveraging massive amount of unlabeled protein data. Our results indicate that individual protein modalities differ in their strengths of predicting affinities or contacts. Proper cross-modality protein embedding combined with self-supervised learning improves model generalizability when predicting both affinities and contacts for unseen proteins.
Availability and implementation: Data and source codes are available at https://github.com/Shen-Lab/CPAC.
Supplementary information: Supplementary data are available at Bioinformatics online.
Comment on “Seismic Velocity Variations at Different Depths Reveal the Dynamic Evolution Associated With the 2018 Kilauea Eruption” by Liu et al.

https://doi.org/10.1029/2022GL102596

Wei, XiaoZhuo; Shen, Yang (September 2023, Geophysical Research Letters)

Abstract Liu et al. (2022, https://doi.org/10.1029/2021GL093691) used Rayleigh waves extracted from the cross-correlation of ambient noise recorded by two stations to monitor the seismic velocity variations associated with the 2018 Kīlauea eruption. However, their study ignored the fact that the tremors on the Island of Hawai'i were dominated by a source at the Kīlauea summit before the eruption. Close inspection of the waveforms of the station pair PAUD-STCD shows a simple, mistakenly identified wave traveling direction in Liu et al. (2022, https://doi.org/10.1029/2021GL093691). A correct wave traveling direction agrees with the noise source model, where the dominant tremor source should be at the Kīlauea summit. Because of the drastic change in the tremor source after the eruption, the cross-correlation of the tremor records may reflect predominantly changes in the source rather than in the medium properties between the two stations.
